INTRODUCTION

The Ultimate Fighting Championship (UFC) has entertained loyal fans across the world for over 30 years with a mix of fighting styles, combining disciplines such as boxing, wrestling, and Brazilian jiu-jitsu. Many of those fans start watching a fight with a simple question: Who do you think is going to win? While most people predict the winner based on gut feeling, bias toward certain fighters, or the highlights they watched that morning, there are many variables available to analyze and statistically predict who the winner of the fight will be.

With so many statistics surrounding each UFC fight, it is unlikely that any single variable determines the outcome. So, we took a deep dive into a dataset containing almost 200 variables of fight statistics from thousands of fights across the UFC’s history. By leveraging data science tools and statistical modeling techniques, we were able to go beyond guesswork and uncover the more nuanced, hidden patterns within UFC fights.

When we decided to look into the dataset of UFC fight statistics, there were countless approaches we could have taken. Based on an exploratory data analysis of ten potential questions, we ultimately settled on two. The first: How significantly does late-round performance impact fight outcomes in the UFC, and does the evidence suggest systematic bias favoring high-profile fighters? Answering this question could reveal whether there are unfair decisions in UFC fights, which would be the case if judges weigh the final round more heavily than the others. It could also help fighters pace themselves across the fight in the way that benefits them most when it goes to the judges.

For our second question, we focused on predicting the winner of a fight from the given statistics, such as the location of strikes. We also found that accounting for other factors, such as control time and takedown success, improved the model. This led to our second question: Can we accurately predict the winner of three-round UFC fights that go to a decision using binary classification models, using per-round strike differentials by location (head, body, leg) and other statistical disparities, such as control time and takedown success? Answering it could help fighters prepare for a specific opponent by using the differentials to decide which aspects of their style to focus on in training. Additionally, if a model can accurately predict winners from differentials and elements like control time, rather than from judge opinion alone, UFC decisions could become fairer and more consistent.

DATA

We found our dataset after sorting through numerous options on Kaggle. We chose it because every group member held at least some interest in the topic and because it includes a multitude of variables, letting us explore many different aspects of UFC fights. The creator of the dataset is Alex Magnus, who scraped the fight statistics directly from the “Completed Events Page” of the UFC website, covering July 2016 through November 2024. The dataset was also awarded a 10.00/10.00 for usability, including 100% in the categories of completeness, credibility, and compatibility.

The original dataset contains 194 different variables, one of which records the time format of the fight. During our exploratory data analysis, we decided to focus only on three-round fights that went to a decision; this standardized our data and restricted attention to the fights that were determined by judges. The filter left 4,168 fights in the dataset and allowed us to remove all columns referring to fourth- and fifth-round statistics. Even after cleaning the dataset down to the variables needed for our two proposed questions, there was still adequate data for a thorough investigation. Additionally, the zip file from the Kaggle dataset included a second dataset called 3rd Round Decision Disparities. Instead of individual fighter statistics, it reports, for each variable in each round, the difference between fighter one’s and fighter two’s statistics. We used this dataset later in the project, specifically in method two of our second question.

Within our table, we have numerous different variables. Fighter1 and Fighter2 give the names of the two fighters in a given fight, and each statistical category is split by fighter and round. For example, Head.F1R1 is the number of strikes fighter one landed to the head in round one, Body.F2R2 is the number fighter two landed to the body in round two, and Leg.F1R3 gives the significant strikes fighter one landed to fighter two’s legs in round three. The Fight.Method variable describes how the fight ultimately ended. An important column in our analysis is the Winner. variable, which takes two possible values depending on the fight outcome: 1 when the favored fighter wins and 0 when the unfavored fighter wins. Whether a fighter is classified as favored falls at the discretion of the UFC. Shown below is a glimpse of the updated, cleaned dataset that we further investigated, restricted to the variables described in this paragraph.

Fighter1 Fighter2 Winner. Head.F1R1 Body.F2R2 Leg.F1R3 Fight.Method
Caio Machado Brendson Ribeiro 0 3 4 2 Decision - Split
Aiemann Zahabi Pedro Munhoz 1 24 3 0 Decision - Unanimous
Serhiy Sidey Garrett Armfield 1 8 6 0 Decision - Split
Magomed Ankalaev Aleksandar Rakic 1 7 3 1 Decision - Unanimous
Lerone Murphy Dan Ige 1 11 3 1 Decision - Unanimous
Mateusz Rebecki Myktybek Orolbai 1 15 4 1 Decision - Split
Farid Basharat Victor Hugo 1 11 9 1 Decision - Unanimous
Rinat Fakhretdinov Carlos Leal 1 17 3 4 Decision - Unanimous
Rob Font Kyler Phillips 1 6 5 0 Decision - Unanimous
Charles Johnson Sumudaerji 1 12 2 1 Decision - Unanimous
Important Features Dataset
F1.Winner Total.Non.Sig.Strike.Landed.Disp Total.Non.Sig.Strike.Missed.Disp Total.Sig.Strikes.Landed.Disp Total.Sig.Strikes.Missed.Disp
0 3 0 27 26
1 0 0 25 27
1 1 1 14 -135
1 16 1 14 -13
1 1 0 -3 5
Location Features Dataset
F1.Winner Total.Head.Disp Total.Body.Disp Total.Leg.Disp Total.Knockdowns.Disp
0 -1 17 11 0
1 -1 16 10 0
1 16 4 -6 0
1 18 -4 0 0
1 -17 13 1 0
Position Features Dataset
F1.Winner Total.Distance.Disp Total.Clinch.Disp Total.Ground.Disp Total.Knockdowns.Disp
0 27 0 0 0
1 25 0 0 0
1 15 -1 0 0
1 9 3 2 0
1 -5 3 -1 0

Throughout the exploratory data analysis, one question we wanted to examine was: Among UFC fights that go the distance with both fighters incorporating striking into how they fight, which strike statistic is the strongest predictor of victory? The purpose of this exploratory analysis was to evaluate whether strike statistics, particularly efficiency-based measures like landed strike proportions, could serve as potential predictors of fight outcomes. By revealing noticeable differences between winners and losers in both total strikes landed and significant strikes landed, this initial analysis helped confirm that striking metrics contained meaningful signal worth investigating further. Although the subsequent modeling primarily used raw strike counts rather than proportions, this early analysis still mattered: it provided initial validation that differences in striking success were predictive of fight outcomes, and the observed separation between winners and losers in efficiency-based metrics justified a broader modeling approach that ultimately incorporated both volume- and accuracy-based variables. The plot “Significant Strike Landed Mean Proportion for Winners vs. Losers” shows an approximate 0.07 proportional difference between winners and losers, while “Total Strike Landed Mean Proportion for Winners vs. Losers” shows an approximate 0.08 difference.

RESULTS

After refining the dataset to include only three-round fights that went to decision, we chose to focus solely on strike statistics in fights where both competitors actively incorporated striking into their game plan. Because there are many ways to win a UFC fight (such as grappling, submissions, and control time), and because non-strike metrics lack clear quality indicators (like the significant vs. non-significant distinction for strikes), we isolated fights where striking activity would be one of the primary drivers of the outcome. To maintain consistency, we filtered the data to include only fights where both fighters ranked above the 25th percentile in total strikes thrown. This allowed us to refine our research question: In three-round fights that go to decision and involve active striking from both fighters, does third-round performance significantly influence the final outcome, and is there evidence of systematic bias favoring certain fighters?
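To make the activity filter concrete, here is a minimal sketch in Python (the analysis itself was done in R, and the strike totals below are invented): the 25th-percentile cutoff is computed over all fighter totals, and a fight is kept only if both fighters clear it.

```python
# Illustrative sketch of the 25th-percentile activity filter (toy data).
import numpy as np

# total strikes thrown by each fighter in six hypothetical fights
f1_thrown = np.array([120, 15, 80, 95, 10, 60])
f2_thrown = np.array([110, 90, 12, 100, 8, 70])

# one cutoff computed over all fighter totals, applied to both corners
cutoff = np.percentile(np.concatenate([f1_thrown, f2_thrown]), 25)

# keep only fights where BOTH fighters are above the cutoff
active = (f1_thrown > cutoff) & (f2_thrown > cutoff)
print(cutoff, active.sum())
```

Filtering on both fighters, rather than either one, is what guarantees that striking is a live factor for the whole fight, not just one corner.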

To better understand strike volume across rounds, we created a boxplot showing the distribution of total punches thrown per round. This visualization revealed that both the median and variability in striking volume significantly increased in Round 3. This suggested that fighters tend to become more desperate or aggressive in later rounds, leading to a spike in activity. Because of this underlying shift in behavior, a simple logistic regression model comparing “round wins” would inherently overweight Round 3. Thus, to prevent exaggerating the impact of the third round, we engineered average strike statistics across rounds and compared predictions based on mid-fight data (Rounds 1 and 2) versus full-fight data.

We standardized each fighter’s performance across all three rounds by averaging all the strike statistics, including knockdowns. Using these averages, we fit multiple logistic regression models, selecting the best one via 10-fold cross-validation to balance Sensitivity and Specificity. The best model was:

     Winner ~ AvgKnockdownsF1 + AvgKnockdownsF2 + 
     AvgNonSigStrikeMissedF1 + AvgNonSigStrikeMissedF2 + 
     AvgSigStrikeLandedF1 + AvgSigStrikeLandedF2 + 
     AvgSigStrikeMissedF1 + AvgSigStrikeMissedF2
Predictor Mean_Coefficient Mean_p_value
(Intercept) -0.2383385 0.52634
AvgKnockdownsF1 2.4798101 0.00277
AvgKnockdownsF2 -1.5907901 0.03639
AvgNonSigStrikeMissedF1 0.2609521 0.00000
AvgNonSigStrikeMissedF2 -0.2130074 0.00023
AvgSigStrikeLandedF1 0.1926845 0.00000
AvgSigStrikeLandedF2 -0.1495901 0.00000
AvgSigStrikeMissedF1 -0.0300734 0.00126
AvgSigStrikeMissedF2 0.0108672 0.21881

While the variable AvgSigStrikeMissedF2 had a relatively high mean p-value (0.21881) and would typically be considered statistically insignificant, we chose to retain it in the final model because its paired variable, AvgSigStrikeMissedF1, was significant, and including both predictors together led to improved overall model performance in terms of Sensitivity and Specificity during cross-validation. This model has a Sensitivity of 82.33% and a Specificity of 73.13%, as well as a false positive rate of 26.88% and a false negative rate of 17.67%, as calculated using the confusion matrix below.
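The selection procedure can be sketched as follows: generate 10-fold cross-validated predictions for a candidate predictor set, then score the candidate by Sensitivity and Specificity. This is an illustrative Python version on synthetic data (the report's work was done in R), not the fitted model above.

```python
# Sketch of scoring one candidate model via 10-fold cross-validation.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)
n = 400
X = rng.normal(size=(n, 4))   # stand-ins for averaged per-round statistics
y = (X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
pred = cross_val_predict(LogisticRegression(max_iter=1000), X, y, cv=cv)

# confusion-matrix counts, then the two metrics used for model selection
tp = np.sum((pred == 1) & (y == 1)); fn = np.sum((pred == 0) & (y == 1))
tn = np.sum((pred == 0) & (y == 0)); fp = np.sum((pred == 1) & (y == 0))
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
print(round(sensitivity, 3), round(specificity, 3))
```

Each candidate predictor set gets one such score pair, and the set that best balances the two is retained.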

After finalizing the model, we generated a new set of engineered variables capturing fighter performance through just Rounds 1 and 2. Using the mean coefficients, we predicted winners based only on mid-fight data and compared these predictions to the full-fight model’s predictions. A “flip” was recorded if the predicted winner based on Rounds 1 and 2 differed from the predicted winner after Round 3 data was incorporated. This comparison allowed us to specifically measure how often the third round changed the projected outcome. Out of the 1,125 fights analyzed, a flip occurred in 92 cases, representing only about 12.8% of the total. Grouping by fighter, we found that nearly all fighters who flipped and won were unique cases, with only Carla Esparza and Cody Stamann flipping outcomes twice. Further investigation into these cases showed that their victories were attributable to genuine late-fight performances rather than favoritism. In Esparza’s cases, she significantly outperformed her opponents in the third round, and Stamann’s fights similarly reflected consistent late-round improvement without evidence of name-based bias.
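The flip rule itself is simple to state in code. Below is a toy Python sketch (the probabilities are invented, and the original comparison was done in R): a flip is recorded whenever the predicted winner from Rounds 1 and 2 disagrees with the prediction once Round 3 is included.

```python
# Toy sketch of the "flip" count between mid-fight and full-fight predictions.
import numpy as np

p_mid  = np.array([0.52, 0.48, 0.90, 0.55, 0.30])  # P(F1 wins), Rounds 1-2 only
p_full = np.array([0.47, 0.55, 0.93, 0.60, 0.20])  # P(F1 wins), all three rounds

pick_mid  = (p_mid  >= 0.5).astype(int)   # predicted winner at end of Round 2
pick_full = (p_full >= 0.5).astype(int)   # predicted winner after Round 3
flips = pick_mid != pick_full

print(flips.sum(), flips.mean())          # count and share of flipped fights
```

Note that both flips in this toy example start near 0.5, mirroring the report's finding that real flips came from fights that were already close after two rounds.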

The visual above demonstrates that the majority of fights that flipped after the third round were already extremely close at the end of Round 2. The distribution is heavily clustered around 0, indicating that most fighters who eventually flipped the prediction had nearly even probabilities of winning before the final round. Rather than massive comebacks, the flips typically resulted from small margins in competitive fights. There were no cases where a fighter with overwhelmingly low or high win probabilities flipped dramatically, supporting the idea that the third round did not disproportionately alter outcomes. Additionally, all flipped fights were associated with unique fighters except for two, further suggesting that late-fight shifts were based on legitimate competitive dynamics rather than superstar bias.

Our second question is: Can we accurately predict the winner of three-round UFC fights that go to a decision using binary classification models, using per-round strike differentials by location (head, body, leg) and other statistical disparities, such as control time and takedown success? To answer our question, we decided to use two different methods and datasets to test different options of answers for this question. Our first method used the original dataset that was described in the Data section. Since this dataset did not have differentials by location, we calculated the differentials ourselves to predict the winner. We used differentials because they capture the true style and tendencies of a fighter by comparing their performance against their opponent, not just their raw totals. This approach allows us to fairly include all types of fight statistics and not just striking, since a fighter who doesn’t grapple much would naturally have negative grappling differentials, which is just as important for the model to recognize as positive striking differentials. Overall, differentials better represent how fighters perform and specialize during a fight.
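To make the differential idea concrete, here is a small Python sketch with hypothetical column names (the actual computation was done in R over the full dataset): each differential is fighter one's statistic minus fighter two's, per round.

```python
# Sketch of per-round differentials; column names are made up for illustration.
import pandas as pd

fights = pd.DataFrame({
    "Head.F1R1": [12, 5], "Head.F2R1": [7, 9],
    "Ctrl.F1R1": [90, 10], "Ctrl.F2R1": [30, 45],   # control time, seconds
})

# negative values are informative too: a fighter who grapples little will
# show negative control-time differentials, which the model can learn from
fights["HeadDiff_R1"] = fights["Head.F1R1"] - fights["Head.F2R1"]
fights["CtrlDiff_R1"] = fights["Ctrl.F1R1"] - fights["Ctrl.F2R1"]
print(fights[["HeadDiff_R1", "CtrlDiff_R1"]])
```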

For our first method, we filtered the original UFC dataset to analyze three-round fights that went to decision. We calculated strike differentials by location for each round, control time differentials, takedown success rates, and strike efficiency percentages. We then created a correlation matrix to understand the relationships between these variables, as shown below. The matrix showed strong positive correlations between similar metrics across rounds, indicating that fighters who dominate in specific areas maintain that advantage throughout the fight. It also showed that control time differentials strongly correlate with victory, suggesting that fighters who control opponents longer consistently win more decisions.

We used a training/testing split of 80% training data and 20% testing data so that our findings would not apply only to the data we were working with and could generalize. For the full logistic regression and stepwise methods, we first fit a model on the training data, then predicted with that model on the testing data.

We implemented Lasso logistic regression to predict UFC fight winners by first transforming the data into matrix format with model.matrix() and standardizing the predictors using scale(), which is essential for the Lasso penalty to affect all variables equally. We then performed cross-validation with cv.glmnet() using alpha = 1 to identify the optimal lambda penalty that balances fit and complexity. With this optimal lambda, we fit the final Lasso model using glmnet() with family = “binomial” for binary classification. This approach automatically selected the most important fight statistics by shrinking less predictive variables’ coefficients to exactly zero, producing a more parsimonious model while maintaining strong predictive power for determining UFC decision winners.
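For readers more familiar with Python, the cv.glmnet() step has a rough scikit-learn analogue. This sketch runs on synthetic data and is not the report's R code: predictors are standardized, the L1 penalty strength is cross-validated, and weak coefficients shrink toward zero.

```python
# Rough Python analogue of Lasso logistic regression with a CV-chosen penalty.
import numpy as np
from sklearn.linear_model import LogisticRegressionCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 500
X = rng.normal(size=(n, 6))   # 6 hypothetical fight-statistic differentials
y = (1.5 * X[:, 0] - 1.0 * X[:, 1] + rng.normal(scale=0.5, size=n) > 0).astype(int)

# scaling first, so the penalty hits every variable equally
Xs = StandardScaler().fit_transform(X)

# L1 penalty (Lasso); the regularization strength is picked by 10-fold CV
lasso = LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10, cv=10).fit(Xs, y)

coefs = lasso.coef_.ravel()
print(np.round(coefs, 2))     # uninformative columns shrink toward zero
```

The design choice is the same as with glmnet: letting the penalty do variable selection avoids hand-picking from nearly 200 candidate statistics.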
In our Random Forest implementation for predicting UFC fight winners, we first calculated the optimal number of variables to consider at each split (mtry) as the square root of the total predictor count, excluding the response variable, then trained the model with the randomForest() function using the fight winner as the target variable. We specified 500 trees (ntree = 500) to ensure model stability and used the calculated mtry value to determine how many variables would be randomly sampled as candidates at each split. Setting importance = TRUE enabled the calculation of variable importance measures, which would later help us identify which fight statistics, such as strike differentials and control time, were most influential in predicting the winners of UFC decision fights. We then wrote a function that converted our predictions to binary values: 0 for any value in [0, 0.5) and 1 for any value in [0.5, 1]. The function then built a confusion matrix from the converted values, calculated the ROC curve and the AUC, and returned accuracy, sensitivity, and specificity.
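A Python sketch of the same setup follows, with randomForest()'s mtry mapped to scikit-learn's max_features and the 0.5-threshold conversion written as a small helper. The data is synthetic and this is not the report's R code.

```python
# Sketch of the Random Forest setup: 500 trees, mtry = sqrt(#predictors),
# and a helper that binarizes predicted probabilities at 0.5.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
n = 300
X = rng.normal(size=(n, 9))                       # 9 hypothetical predictors
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

mtry = int(np.sqrt(X.shape[1]))                   # sqrt of predictor count
rf = RandomForestClassifier(n_estimators=500, max_features=mtry,
                            random_state=0).fit(X, y)

def to_binary(prob):
    """0 for prob in [0, 0.5), 1 for prob in [0.5, 1]."""
    return (np.asarray(prob) >= 0.5).astype(int)

pred = to_binary(rf.predict_proba(X)[:, 1])
print(pred[:10])
```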

The next step in our investigation process toward answering the above question is to figure out which model fits the data in the best way possible. We decided to look into five different modeling approaches: Full Logistic Regression, Stepwise Selection (Forward), Stepwise Selection (Backward), Lasso Logistic Regression, and Random Forest. The models were then evaluated based on Accuracy, Specificity, Sensitivity, the AUC value, and ROC curves.

The metrics we used to evaluate our models are the confusion matrix, Accuracy, Sensitivity, Specificity, ROC curves, and AUC values. The confusion matrix in the UFC decision prediction model provides a breakdown of how well the model classified fight outcomes and gives us the foundation for calculating the other metrics. It tallies four counts: True Positives (TP) are fights where the model predicted the favored fighter would win and the favored fighter did win. True Negatives (TN) are fights where the model predicted the favored fighter would lose and the favored fighter did lose. False Positives (FP) are fights where the model predicted the favored fighter would win, but the favored fighter lost. False Negatives (FN) are fights where the model predicted the favored fighter would lose, but the favored fighter won. Accuracy measures the overall percentage of correct predictions: (TP + TN) / (TP + TN + FP + FN). Sensitivity measures, of all actual wins, what percentage the model correctly predicted: TP / (TP + FN). Specificity measures, of all actual losses, what percentage the model correctly predicted: TN / (TN + FP). A Receiver Operating Characteristic (ROC) curve is a graph that shows the diagnostic ability of a classification model as its discrimination threshold varies. The x-axis represents the False Positive Rate and the y-axis the True Positive Rate; each point on the curve corresponds to a different threshold for classifying a fighter as a winner. Moving along the curve trades off correctly identifying more winners (higher sensitivity) against incorrectly classifying more losers as winners (more false positives).
The Area Under the Curve (AUC) of the ROC curve is a single number that measures the model’s overall ability to discriminate between winners and losers in UFC fights, regardless of the specific classification threshold chosen. The AUC is the probability that the model ranks a randomly chosen winning fighter higher than a randomly chosen losing fighter, so a higher AUC means that fighters who actually won consistently receive higher predicted probabilities than fighters who lost.
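The formulas above can be checked on a small hypothetical example; the confusion-matrix counts and scores below are invented, and the AUC is computed directly from its rank-based (Mann-Whitney) definition rather than from an ROC curve.

```python
# The evaluation metrics applied to a small hypothetical example.
import numpy as np

tp, tn, fp, fn = 80, 70, 20, 30
accuracy    = (tp + tn) / (tp + tn + fp + fn)   # share of correct predictions
sensitivity = tp / (tp + fn)                    # share of actual wins caught
specificity = tn / (tn + fp)                    # share of actual losses caught

# AUC = P(random winner's score > random loser's score)
scores = np.array([0.9, 0.8, 0.4, 0.7, 0.2, 0.3])
labels = np.array([1,   1,   1,   0,   0,   0  ])
wins   = scores[labels == 1]
losses = scores[labels == 0]
auc = np.mean([w > l for w in wins for l in losses])

print(accuracy, sensitivity, specificity, auc)
```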

While comparing the ROC curves, we saw that all five models plot closely together, indicating similar performance: every model was effective at distinguishing the winner from the loser. However, when we looked deeper into the individual values of Accuracy, Sensitivity, and Specificity, full logistic regression emerged as the best model, with an Accuracy of 0.845, Sensitivity of 0.824, and Specificity of 0.86. The table below and its corresponding ROC curve graph compare these values across all models. We then created a confusion matrix of the best model to evaluate it further.

Model Accuracy Sensitivity Specificity AUC
Full Logistic 0.845 0.824 0.86 0.915
Forward Stepwise 0.84 0.818 0.856 0.914
Backward Stepwise 0.84 0.818 0.856 0.914
Lasso 0.845 0.818 0.865 0.915
Random Forest 0.824 0.761 0.869 0.917

Our next step was to analyze which features mattered most in the best model. We created a data frame of model coefficients, sorted by absolute value to identify the most influential features, renamed variables for readability, and recorded whether each feature has a positive or negative effect on winning probability. We then visualized these results using a horizontal bar chart colored by direction of influence (red for negative, green for positive), shown below.

Takedown success in round one (TDSuccess_F1R1) was our most influential variable, highlighting the importance of early takedown success in securing decision wins. Strike efficiency in round one for the second fighter (StrikeEff_F2R1) had the strongest negative coefficient, indicating that high first-round strike efficiency for fighter two counterintuitively decreases their winning probability; this might suggest that fighters who start with high efficiency but cannot maintain it throughout the fight tend to lose. Control time differentials across all rounds showed significant positive coefficients, with later rounds having stronger effects than the first round. This aligns with judging criteria that reward effective control and suggests that dominating control time, especially in later rounds, strongly influences decisions.

Taken together: maintaining control time is crucial for winning decisions, successfully executing takedowns early in the fight significantly increases winning probability, strike efficiency may matter more than raw strike volume, and strong performance in later rounds appears more influential than early rounds. To win by decision, fighters should therefore consider prioritizing control time and takedown success over volume striking.
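The coefficient-ranking step can be sketched as follows. The names echo the report's features, but the values are hypothetical stand-ins, not the fitted coefficients.

```python
# Sketch of feature ranking: sort coefficients by absolute value and
# record the direction of each effect. Values are illustrative only.
coefs = {
    "TDSuccess_F1R1":  0.62,   # early takedown success (positive effect)
    "StrikeEff_F2R1": -0.58,   # fighter two's round-1 efficiency (negative)
    "CtrlDiff_R3":     0.41,   # late control-time differential
    "CtrlDiff_R1":     0.15,   # early control-time differential
}

ranked = sorted(coefs.items(), key=lambda kv: abs(kv[1]), reverse=True)
for name, c in ranked:
    print(f"{name:16s} {c:+.2f} {'positive' if c > 0 else 'negative'}")
```

Sorting by absolute value is what lets a strong negative predictor rank alongside strong positive ones in the importance chart.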

For our second method, we used a dataset associated with our original dataset on Kaggle that already contained calculated differentials for every numeric fight statistic, including only three-round fights that went to decision. Due to the nature of the dataset, striking statistics were counted multiple times; for example, one strike can simultaneously be a shot to the head, a shot on the ground, and a significant strike, and therefore appear in each of those columns. Because of this, we split the dataset into three smaller datasets. Important features came first, covering whether a strike was significant and whether it landed, alongside other non-striking variables. Location features came second, covering the area of the body each strike hit, again with other non-striking variables. Position features came last, covering where the fighters were relative to one another when a strike landed. Next, we created a correlation heatmap of all of the variables in this dataset to determine how they correlated with each other. Across the three correlation matrices, we found that landing more significant strikes, achieving longer control time, completing takedowns, and scoring knockdowns are all positively correlated with winning. Striking effectively to the head, body, and legs, especially with more head strikes, also showed a positive correlation with victory. Fighters who control the fight for longer periods and succeed at distance, in the clinch, and on the ground tend to win more often, while missed takedowns negatively impact the chances of winning. Our correlation matrices are shown below:
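As a toy illustration of one cell of such a heatmap (synthetic data; the report built full matrices in R), the correlation between a control-time differential and winning can be computed like this:

```python
# One cell of a correlation heatmap: control-time differential vs. winning.
import numpy as np

rng = np.random.default_rng(3)
n = 200
ctrl_disp = rng.normal(size=n)                     # hypothetical differential
winner = (ctrl_disp + rng.normal(scale=0.8, size=n) > 0).astype(int)

r = np.corrcoef(ctrl_disp, winner)[0, 1]
print(round(r, 2))   # positive: longer control time goes with winning
```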

Our next step was to train and test each dataset the same way we did in method one, with 80% of fights in the training dataset and 20% in the testing dataset. For each feature set, we applied the same five modeling techniques used in the first method: Full Logistic Regression, Forward Stepwise Logistic Regression, Backward Stepwise Logistic Regression, Lasso Regression, and Random Forest. The models were then assessed on Accuracy, Sensitivity, Specificity, and AUC to determine which performed best. We created a function to find the best-performing model for each dataset by calculating an overall score, simply the average of Accuracy, Sensitivity, and Specificity, selected the top model per dataset, then identified the single best model across all datasets as the one with the highest overall score. We printed both the per-dataset winners and the overall best model, and visualized their performance using the bar plots shown below.

Based on the results, the models achieved about 81% to 85% accuracy across the board, with AUC values generally between 0.89 and 0.92, Sensitivity between 0.76 and 0.81, and Specificity between 0.84 and 0.88. This shows that statistical fight metrics can indeed predict winners successfully around 84% of the time. Additionally, all models had better Specificity than Sensitivity, meaning they are better at predicting when the red-corner fighter will lose than when they will win; this suggests the UFC’s fighter placement (the red corner typically being the favorite) might not always align with what the statistics indicate. Interestingly, performance was fairly similar across all three feature sets, telling us that strike location data, position metrics, and general fight statistics each provide valuable predictive information. Of the three, the location dataset produced the models with the highest Accuracy, showing that strike location statistics gave the best predictions.
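The overall-score rule can be reproduced from the per-dataset winners reported in the tables that follow; the three metric triples below are taken directly from those tables.

```python
# Overall score = mean(Accuracy, Sensitivity, Specificity); pick the max.
results = {
    ("Important Features", "Full Logistic"): (0.8399, 0.8050, 0.8649),
    ("Location Features",  "Lasso"):         (0.8478, 0.7987, 0.8829),
    ("Position Features",  "Full Logistic"): (0.8373, 0.7925, 0.8694),
}

scores = {k: sum(v) / 3 for k, v in results.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 4))
```

Averaging the three metrics equally is a deliberate simplification: it prevents a model from winning on Accuracy alone while being badly unbalanced between Sensitivity and Specificity.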

Model Performance Comparison Across All Datasets
Dataset Model Accuracy Sensitivity Specificity AUC
Important Features Full Logistic 0.8399 0.8050 0.8649 0.9152
Important Features Forward Stepwise 0.8399 0.8050 0.8649 0.9145
Important Features Backward Stepwise 0.8399 0.8050 0.8649 0.9145
Important Features Lasso 0.8373 0.8050 0.8604 0.9155
Important Features Random Forest 0.8241 0.7925 0.8468 0.9102
Location Features Full Logistic 0.8451 0.7987 0.8784 0.9224
Location Features Forward Stepwise 0.8451 0.7987 0.8784 0.9224
Location Features Backward Stepwise 0.8451 0.7987 0.8784 0.9224
Location Features Lasso 0.8478 0.7987 0.8829 0.9224
Location Features Random Forest 0.8373 0.7862 0.8739 0.9069
Position Features Full Logistic 0.8373 0.7925 0.8694 0.9128
Position Features Forward Stepwise 0.8373 0.7925 0.8694 0.9127
Position Features Backward Stepwise 0.8373 0.7925 0.8694 0.9127
Position Features Lasso 0.8346 0.7862 0.8694 0.9132
Position Features Random Forest 0.8136 0.7673 0.8468 0.8910
Best Performing Models by Dataset
Dataset Model Accuracy Sensitivity Specificity Overall Score
Important Features Full Logistic 0.8399 0.8050 0.8649 0.8366
Location Features Lasso 0.8478 0.7987 0.8829 0.8431
Position Features Full Logistic 0.8373 0.7925 0.8694 0.8330

The final step in testing the second method was a feature importance analysis for the best model on each feature dataset. The code extracts feature importances, sorts features by the absolute value of their importance scores (so both strong positive and strong negative impacts are considered), and selects the top 10 features to plot. Each horizontal bar represents a combat statistic, with its length indicating its importance score (0.0 to 0.6) in predicting fight outcomes; the color gradient (light green to red) also encodes importance, with redder colors indicating higher importance. For Position Features, the best model was Full Logistic, with TD.CompletedDisp, RevDisp, and KnockdownsDisp emerging as the top predictors. In the Location Features set, the Lasso model performed best, with TD.CompletedDisp, RevDisp, and Ctrl.TimeMinutesDisp most important. For Important Features, Full Logistic was again the top model, highlighting TD.CompletedDisp, RevDisp, and KnockdownsDisp. Across all three datasets, TD.CompletedDisp and RevDisp consistently ranked highest, indicating they are the strongest predictors of the target variable. Of the statistics unique to each feature set, whether a strike landed and whether it was significant had the least importance, while a strike’s location on the body and the fighters’ relative position had similar importance; knowing a strike’s significance alone does not provide enough context. Overall, the results demonstrate that statistical fight metrics can predict UFC decision outcomes with moderate success. Achieving approximately 84% accuracy is valuable, though it also highlights the inherent unpredictability of fight outcomes. This analysis confirms the hypothesis that strike differentials by location (head, body, leg), alongside position metrics and control time statistics, can serve as meaningful predictors of fight outcomes.

CONCLUSION

Answering our first question, how significantly late-round performance impacts fight outcomes in the UFC and whether the evidence suggests systematic bias favoring high-profile fighters, required many stages of modeling and visualization to understand whether the last round could be deemed the one a fighter most needs to “win” in order to win the overall fight. Ultimately, we determined that while Round 3 performance can influence outcomes in certain fights, it does not disproportionately determine the winner, nor is there strong evidence that superstar fighters receive preferential treatment from judges. The low flip rate of 12.8%, combined with the rarity of fighters flipping multiple outcomes, suggests that consistent performance across all rounds, rather than late surges or name recognition, remains the primary driver of victory.

These findings have important practical implications for UFC fighters and their teams. Rather than relying on a final-round push or attempting to “steal” a fight late, fighters should prioritize building steady advantages throughout all three rounds. Strategies emphasizing consistent striking output, sustained pressure, and round-by-round scoring accumulation are likely to be more successful over time. Understanding that judges tend to evaluate fights holistically, without significant final-round bias, can help fighters plan more balanced approaches to winning on the scorecards, rather than risking it all in the final minutes. While our analysis suggests that third-round performance does not disproportionately impact outcomes, future studies could focus on incorporating detailed round-by-round judges’ scoring data. Access to this more granular information would provide a more comprehensive look at how each round individually contributes to the final decision, addressing limitations in our current dataset where only overall fight outcomes were available.

When it came to answering our second question, whether we can accurately predict the winner of three-round UFC fights that go to a decision using binary classification models built on per-round strike differentials by location (head, body, leg) and other statistical disparities such as control time and takedown success, we were able to predict the winner to a meaningful extent. Our first method, using full logistic regression on the holistic Kaggle dataset, determined who won the fight with over 80% Accuracy from statistics recorded throughout the fight. Additionally, when we tested the disparities dataset across the three feature categories of importance, position, and location, full logistic regression and Lasso models reached 83.7% to 84.8% Accuracy, with strike location variables being the most helpful of all strike categories.

By utilizing the models above, we could help UFC fighters gauge the importance of different fight statistics, compared against their upcoming opponent’s averages, in winning a fight. For example, if an opponent usually posts high differentials in strikes landed to the legs on the way to victory, a fighter could train to defend leg strikes better, neutralizing a style the opponent favors. The UFC is all about fighting to your strengths; understanding and planning around how your opponent fights, using the data as precedent, can lead to better preparedness and, ultimately, more wins. These findings and models could be extended in the future to five-round fights that go to decision. Those are normally the biggest fights, drawing the largest crowds and carrying the most importance, so tools that clarify what most influences the judges’ decisions can help championship fighters prepare winning strategies.